Abstract

Cognitive performance in untreated early onset gender identity disorder (GID) patients might correspond to their born sex and
not to their perceived gender. As a current mode of intervention, cross-sex hormone treatment causes considerable physical
changes in GID patients. We asked, as has been suggested, whether this treatment skews cognitive performance towards that of the
acquired sex. Somatically healthy male and female early onset GID patients were neuropsychologically tested before, and 3 and 12
months after, initiating cross-sex hormone treatment, whereas untreated healthy subjects without GID served as controls (C).
Performance was assessed by testing six cognitive abilities (perception, arithmetic, rotation, visualization, logic, and
verbalization), and controlled for age, education, born sex, endocrine differences and treatment by means of repeated measures
analysis of variance. GID patients and controls showed an identical time-dependent improvement in cognitive performance. The
slopes were essentially parallel for males and females. There was no significant three-way interaction of born sex by group by
time for the six investigated cognitive abilities. Only education and age significantly influenced this improvement. Despite the
substantial somatic cross-sex changes in GID patients, no differential effect on cognition over time was found between C and GID
participants. The cognitive performance of cross-sex hormone-treated GID patients was virtually identical to that of the control
group. The documented test-retest effect should be taken into consideration when evaluating treatment effects generally in
psychiatry.
© 2005 Elsevier Ireland Ltd. All rights reserved. 
1. Introduction

Gender identity disorder (GID) in adults (DSM-IV 302.85; American Psychiatric Association, 1994) is
characterized by a discrepancy between objective born sex and subjective gender identification,
expressed as a feeling of being born in the wrong sex. Although both biological and psychological
models have been proposed (Baker, 1969; Hoenig and Kenna, 1974; Person and Ovesey, 1974a,b; Stoller,
1979, 1985; Person and Ovesey, 1983; Blanchard, 1989; Johnson and Hunt, 1990; Brems, 1993;
Blanchard et al., 1995; Zhou et al., 1995; Cohen and Ruiter, 1997), there is no established
aetiology for the GID syndrome (Cohen-Kettenis and Gooren, 1999; Michel et al., 2001). Regardless of
etiological controversy, evidence has been presented that the cognitive performance of GID patients
might be skewed towards that of their subjective gender (Cohen-Kettenis et al., 1998). Such skewing
might originate from prenatal or postnatal hormonal influences. In favor of the former, a prenatal
organizational effect of sex hormones on cognitive performance in early onset GID patients has been
postulated from studies of prenatal endocrine disorders such as congenital adrenal hyperplasia
(CAH). In some of these studies, adolescent female CAH patients display a cognitive performance
skewed towards that of healthy males. So far, enhanced spatial ability (Resnick et al., 1986), a
lower verbal and performance IQ and lower perceptual speed scores (Nass and Baker, 1991; Hampson et
al., 1998), as well as enhanced right hemisphere development (Nass et al., 1987; Kelso et al.,
1999), have been described. These cognitive differences of female CAH patients would be
hypothetically based on their androgenized prenatal and perinatal hormone profile, while their
postnatal profile is normalized by corticosteroid therapy. On the other hand, several other studies
have failed to verify differences in cognitive brain function of women with CAH compared with their
healthy siblings (McGuire and Omenn, 1975; McGuire et al., 1975; Helleday et al., 1994; Kelso et
al., 2000).

As a current mode of intervention in adult GID patients, cross-sex hormone treatment causes
considerable somatic changes. In the course of such treatment, several authors have suggested that
the cognitive brain function of adult early onset GID patients might be activated towards that of
the subjective gender, thus paralleling the endocrine and somatic changes observed during treatment
(Miles et al., 1998; Slabbekoorn et al., 1999; Van Goozen et al., 1994). Hence, androgen-treated
female GID patients have been reported to improve in cognitive tasks generally favoring males
[“mental rotation” (Van Goozen et al., 1994; Slabbekoorn et al., 1999)], but deteriorate in tasks
favoring females [“verbal fluency” (Van Goozen et al., 1994)]. Conversely, male GID patients treated
with estrogen reportedly showed a decrease in their performance on tasks favoring males [“mental
rotation” (Slabbekoorn et al., 1999)], and an improvement in tasks favoring females [“verbal memory”
(Miles et al., 1998)]. However, such an activating effect could not be replicated in a later study
(Van Goozen et al., 2002).

The evidence for a hypothetically activating effect of cross-sex hormones in GID patients would
appear to have support in cognitive studies of sex hormone substitution in healthy elderly males
(Janowsky et al., 1994; Carlson et al., 1999; Maki et al., 2001), as well as in postmenopausal women
(Sherwin, 1988, 1997; Phillips and Sherwin, 1992; Yaffe et al., 2000) and female patients with
dementia of the Alzheimer type (AD), in whom an improvement of cognitive performance has been
reported (Henderson et al., 1996; Tang et al., 1996; Kawas et al., 1997). However, other studies of
either healthy elderly females (Hackman and Galbraith, 1976; Ditkoff et al., 1991) or AD females
have failed to observe such improvement (Henderson et al., 2000; Mulnard et al., 2000; Wang et al.,
2000). Furthermore, some evidence for the activating hypothesis of cross-sex hormones is derived
from studies of female cognitive function during different phases of the menstrual cycle (Hampson
and Kimura, 1988; Saucier and Kimura, 1998), as well as from studies of testosterone fluctuations
and their correlation to different cognitive functioning in men (Christiansen and Knussmann, 1987;
Moffat and Hampson, 1996). However, other studies failed to support these observations (Gordon et
al., 1986; McKeever et al., 1987).

The divergent results of the cognitive studies that address hormonal effects on brain function may
be partly explained by the type of cognitive functions that were studied, the neuropsychological
tests that were used, and the confounders that were analyzed (e.g., education, health, mood)
(Barrett-Connor and Kritz-Silverstein, 1993; McKeever, 1995; Teri et al., 1997; Wisniewski, 1998;
Yaffe et al., 1998; Berenbaum, 2001; LeBlanc et al., 2001). For example, formal education and
physical health status have been found to be negatively correlated to the cognitive function of
demented patients (Breteler et al., 1992; Keefover, 1996). Moreover, the statistically significant
improvement over time in previous studies of retested patients may at least partly be related to
the test/retest effect of repeated testing rather than to hormone administration (McCaffrey et al.,
1993).

In addition to problems associated with methods of analysis, results can also be affected by
differential sex-specific cellular responses to a given sex hormone. In other words, male and female
cells may respond differently to a sex hormone depending on sex chromosome-encoded differences
(Rimanoczy et al., 2001; Vathy, 2001). Furthermore, the possible estrogen-independent activation of
intracellular estrogen receptors in brain tissue would make the aetiology of a hormone-responsive
cognitive function even more complex (Ciana et al., 2003). Finally, the cognitive ability may also
be influenced by an interplay between complex “biopsychosocial” factors and not only by sex
(Breedlove, 1997).

We have recently reported that the cognitive performance of young, untreated early onset GID
patients, before cross-sex hormone treatment, corresponds to their born sex rather than their
subjective gender (Haraldsen et al., 2003). Our data therefore did not support the recently
published hypothesis of a differently organized cognitive function in early onset GID patients
(Cohen-Kettenis et al., 1998; Van Goozen et al., 2002). The purpose of this study was to test
whether cross-sex hormone treatment of early onset GID patients would shift their cognitive
performance towards that of their subjective gender over a 1-year period of therapy.


2. Methods 

2.1. Subjects 

This study included 52 somatically healthy early onset GID patients who consecutively sought sex
reassignment surgery (SRS) in Norway from 1996 to 1998 (GID-N, n = 33, 21 females, 12 males, mean
age = 26.7, S.D. = 5.9 years) or from the freestanding private Gender Center, Palo Alto, California,
in 1997 and 1998 (GID-US, n = 19, 9 females, 10 males, mean age = 35.2, S.D. = 10.0 years). All
patients were evaluated according to the Harry Benjamin International Gender Dysphoria Association’s
Standards of Care (1990 and 1998; Levine et al., 1998). Furthermore, the patients underwent two
independent comprehensive evaluations by two senior psychiatrists (one of them the first author),
who belong to the National Center of Expertise for evaluating GID patients. The National Center has
existed for more than 40 years in Norway. In the US, the same procedure was used, including the
first author as one of the three evaluators. Disagreement between the evaluators regarding diagnosis
led to exclusion of such patients from the study. All patients fulfilled the criteria for GID
according to DSM-IV and the Swedish selection criteria for SRS (Walinder et al., 1977). All included
GID patients were diagnosed as early onset GID patients, fulfilling criteria A to D in DSM-IV from
childhood on. They were homosexually attracted to either males or females (n = 38), attracted to
both (n = 3) or to neither (n = 9) at the time of investigation. Two patients reported heterosexual
orientation.

All participants were Caucasians, chromosomally and endocrinologically screened, and free of
medication. None of the GID patients had received previous cross-sex hormone treatment. Participants
with any endocrinological, genetic, neurological or major psychiatric comorbidity were excluded
[n = 3, from GID-N (2 delusional disorders, 1 XXY anomaly)]. The control group members (C) were
heterosexual Norwegian participants with no lifetime diagnosis of GID. They were either high school
graduates, military recruits from the armed forces, college students or employees of the University
of Oslo (n = 29, 15 females, 14 males, mean age = 24.3, S.D. = 10.2 years). They were recruited by
advertisement.

In an independent-samples t-test, the age distribution did not differ (P = 0.8, mean
difference = 0.5 years) between females (n = 45, mean age = 28.4, S.D. = 10.3 years) and males
(n = 36, mean age = 27.9, S.D. = 9.2 years). The age distribution was also equal between
Control and GID-N groups (P= 0.5, mean differ¬ 
enced.55 years) but differed between Controls and 
the two GID groups (P = 0.04, mean difference = −9.1
years) because of the older GID-US group.

The educational level of all participants (n = 81) was scored and categorized as follows: 1 = high
school graduates (n = 27), 2 = college graduates (n = 43), 3 = higher university degree (n = 11).
Uncorrected one-way ANOVA revealed significant differences between all groups (F = 71.9,
P < 0.0005). The lowest education level was found in the GID-N group (78.8% with only a high school
degree) and the highest level in the GID-US group (52.6% with a university degree). There was no
difference in education between females and males within the groups (C: F = 0, P = 1; GID-N:
F = 1.6, P = 0.2; GID-US: F = 2.6, P = 0.1), but there was a significant difference between males
and females in the sample as a whole (males were more highly educated than females: F = 5.7,
P = 0.02) when corrected for the significant influence of age on education (F = 4.0, P = 0.03). All
participants in this study were right-handed.

2.2. Treatment 

All born male GID patients (n = 22) received 50 µg of oral ethinylestradiol (Etifollin) daily
during the first 3 months of treatment, and thereafter 100 µg daily. All born female GID patients
(n = 30) received 180 mg testosterone enantate (Primoteston-Depot) as an intramuscular injection
every third week.

2.3. Neuropsychological testing 

Somatically healthy male and female GID patients (n = 52) were neuropsychologically tested 8 to 2
weeks before, and 3 and 12 months after, initiating cross-sex hormone treatment, whereas untreated
healthy male and female subjects without GID served as controls (n = 29). Each test session started
at 09:00 h and lasted for 3 h with two 15-min breaks after the first and second hour. The tests
were presented in random order and administered by one of two trained test assistants to all
participants, including controls.

2.3.1. Neuropsychological tests 

Six cognitive factors (rotation, visualization, perception, verbalization, logic, arithmetic) were
examined by a selection of 11 tests in two parts from the officially distributed “Kit of
factor-referenced cognitive tests” by ETS [Educational Testing Service (www.ets.org), licensing
agreement signed 1996 (Ekstrom et al., 1976)], which is methodologically based on a factor analysis
(Ekstrom et al., 1979). All cognitive factors were represented by four tests (two forms of a test in
two versions) except perception (two tests). The same tests were randomized and administered at the
two following sessions after 3 and 12 months of hormonal treatment. The tests were selected to
represent cognitive factors that reveal significant sex differences according to earlier studies
(Van Goozen et al., 1994; Voyer et al., 1995; Cohen-Kettenis et al., 1998; Halpern, 2000). Thus,
rotation and visualization were expected to favor males, and perception and verbalization to favor
females. Logic and arithmetic were expected to be neutral, being associated not with born sex but
with other predictors. Test instructions were given in each patient’s native language. All test
sessions started at 09:00 h, stopped at 12:30 h, and included breaks.

Rotation was assessed by means of the card rotation test and the cube comparison test. In the
former, the subject picks the one out of eight figures that represents a mirrored or rotated version
of the stimulus figure. Three minutes are available for 10 exercises. In the latter test, a pair of
pictured cubes, each with a different orientation, is determined to be identical or not. Three
minutes are available for 21 exercises.

Visualization was assessed by means of the form 
board test and the paper folding test. In the former, a 
figure must be constructed by assembling up to five 
figures. Eight minutes are available for 24 exercises. 
In the latter test, a picture illustrates a folded piece of 
paper with a punched hole. Out of five additional 
pictures, the subject picks the one that represents the 
identical, but unfolded paper by determining the new 
position of the hole(s). Three minutes are allowed for 
10 exercises. 

Perception was assessed by means of the identical 
pictures test in which 48 figures are evaluated in 1.5 
min. Out of five figures, the subject picks one that is 
identical to the stimulus figure. 

Verbalization was assessed by means of the word 
ending test and the word beginning test. The subject 
writes as many words as possible with a given prefix 
or suffix within 3 min. 

Logic was assessed by means of the nonsense test and the diagramming relationship test. In the
former, the subject is asked to evaluate the logical reasoning by which two sentences lead to a
concluding third sentence. Four minutes are available for 15 exercises. In the latter test, Venn
diagram-style geometric figures represent the relationship between three words. Five different
diagrams are presented for each exercise. Four minutes are available for 15 exercises.

Arithmetic was assessed by means of the arithmetic 
aptitude test and the arithmetic operation test. In the 
former, 15 min are available for 15 calculations. 
Results are picked from five alternative answers. In 
the latter test, 15 min are available for 15 calculations in 
which the subject picks the correct arithmetic operation 
required for a given result (e.g., addition, subtraction). 

2.4. Laboratory methods 

Blood samples were drawn during the second break at the first test session and, for the GID-US and
the GID-N participants, at all test sessions (i.e., before and after 3 and 12 months). All serum
samples were stored at −20 °C until measured at the Hormone Laboratory, Aker University Hospital,
Oslo. The blood samples obtained from the GID-US group were stored at −20 °C and kept frozen on dry
ice during shipment to Norway, thus enabling identical analyses of all samples.

Serum concentrations of estradiol (E2), follicle-stimulating hormone (FSH), luteinizing hormone
(LH), free thyroxine (FT4), thyroid-stimulating hormone (TSH), and prolactin (PRL) were determined
by time-resolved fluoroimmunoassays (Delfia, Wallac, Turku, Finland). Serum concentrations of
cortisol, progesterone, sex hormone binding globulin (SHBG) and testosterone were determined by
radioimmunoassays (Orion Diagnostica, Espoo, Finland), and so were dihydrotestosterone (DHT) and
estrone (E1) (in-house radioimmunoassays).

The normal ranges for adults established in our laboratory were as follows: E2, males 0.08-0.11,
females 0.08-0.85 nmol/l; FSH and LH, 1-12 IU/l; free T4, 8-20 pmol/l; TSH, 0.2-4.5 IU/l; PRL,
50-700 mIU/l; cortisol (08:00 h), 250-750 nmol/l; progesterone, males < 3, females in follicular
phase < 3, in luteal phase > 15 nmol/l; SHBG, males 10-60, females 30-90 nmol/l; testosterone,
males 8-35, females 0.3-2.8 nmol/l; DHT, males 1.40-2.60, females 0.35-1.00 nmol/l; and E1, males
0.10-0.28, females 0.11-0.45 nmol/l.

The free testosterone index (FTI) was calculated (testosterone/SHBG × 10) as an expression of the
free and biologically active testosterone concentration. The normal ranges established in our
laboratory (n = 926) were as follows: males 2.3-12.8, females 0.1-0.6.
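
As a worked illustration of the index defined above, the following minimal sketch computes testosterone/SHBG × 10 in Python. The function name `free_testosterone_index` is hypothetical and not part of the original study; the example input uses the control-male group means reported in Table 1, so the result only approximates the mean of the individual indices.

```python
def free_testosterone_index(testosterone_nmol_l: float, shbg_nmol_l: float) -> float:
    """Return the free testosterone index: testosterone / SHBG x 10.

    Both concentrations are expected in nmol/l, as in Section 2.4.
    """
    if shbg_nmol_l <= 0:
        raise ValueError("SHBG concentration must be positive")
    return testosterone_nmol_l / shbg_nmol_l * 10


# Control males, group means from Table 1: testosterone 17.4 nM, SHBG 23.8 nM.
# An index built from group means is only an approximation of the mean index.
fti_control_males = free_testosterone_index(17.4, 23.8)
print(round(fti_control_males, 1))
```

The value obtained from the group means agrees with the FTI of 7.3 listed for control males in Table 1 and lies within the stated male normal range of 2.3-12.8.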

2.5. Statistics 

A series of repeated measures ANOVAs (Altman, 1991; SPSS 11.0, 2001) was applied. The six measured
cognitive abilities (perception, verbalization, arithmetic, visualization, logic, rotation) served
as dependent variables. First, the time effect was evaluated. Next, each predictor’s influence was
analyzed [group (C, GID-N, GID-US); sex; age (years); and education level], and finally, each
predictor’s influence was adjusted for all others. All interactions were tested.
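
The first step, the evaluation of the time effect, can be sketched as a one-way repeated measures ANOVA. The code below is an illustrative pure-Python sketch, not the SPSS procedure used in the study: the function name `rm_anova_time_effect` and the toy scores are hypothetical, the between-subject predictors (group, sex, age, education) are omitted, and complete data for every subject are assumed.

```python
def rm_anova_time_effect(scores):
    """One-way repeated measures ANOVA for the within-subject factor time.

    scores: list of per-subject lists, one score per test session.
    Returns (F, df_time, df_error) without any sphericity correction.
    """
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of test sessions
    if any(len(row) != k for row in scores):
        raise ValueError("every subject needs a score for every session")

    grand_mean = sum(sum(row) for row in scores) / (n * k)
    session_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subject_means = [sum(row) / k for row in scores]

    # Partition the total sum of squares into time, subject and error parts.
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_time = n * sum((m - grand_mean) ** 2 for m in session_means)
    ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_error = ss_total - ss_time - ss_subjects

    df_time = k - 1
    df_error = (k - 1) * (n - 1)
    f_value = (ss_time / df_time) / (ss_error / df_error)
    return f_value, df_time, df_error


# Hypothetical raw scores of three subjects at T1, T2 and T3 (not study data).
toy_scores = [[1, 2, 3], [2, 3, 4], [3, 4, 6]]
f_value, df_time, df_error = rm_anova_time_effect(toy_scores)
print(f_value, df_time, df_error)
```

Because the sketch requires a score at every session, subjects with missing values cannot enter it, which mirrors the restriction of the repeated measures analysis to complete cases.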

Pearson and Spearman correlations were calculated between the changes of the endocrine parameters
(e.g., delta FTI, delta E2) and the changed neuropsychological test scores over time (Altman, 1991;
SPSS 11.0). The significance level was 0.05. The assumptions of the analyses were checked and met.
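
The correlation of change scores described above can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions: the helper names (`deltas`, `pearson`, `ranks`, `spearman`) are hypothetical, and the FTI and rotation values are invented for demonstration and are not study data.

```python
from math import sqrt


def deltas(before, after):
    """Change score per subject: value after treatment minus value before."""
    return [a - b for b, a in zip(before, after)]


def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            result[idx] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Invented example values for four subjects (NOT data from the study).
fti_t1, fti_t3 = [0.5, 0.7, 0.6, 0.9], [10.2, 11.5, 9.8, 12.0]
rotation_t1, rotation_t3 = [3, 4, 5, 4], [8, 9, 7, 11]

delta_fti = deltas(fti_t1, fti_t3)
delta_rotation = deltas(rotation_t1, rotation_t3)
print(pearson(delta_fti, delta_rotation), spearman(delta_fti, delta_rotation))
```

Reporting both coefficients, as the study does, guards against a single outlying change score dominating the Pearson estimate, since the Spearman coefficient depends only on rank order.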

Of 81 participants enrolled at baseline, 60 subjects were included in the repeated measures ANOVA
because of randomly missing values in 21 cases. Replacement of missing values could have reduced
the variance and might have improved the chances of finding significant influences. There were no
dropouts during the study. The Greenhouse-Geisser epsilon is an adjustment used in repeated
measures analysis when the sphericity assumption is violated. Both numerator and denominator
degrees of freedom must be multiplied by epsilon, and the significance of the F ratio evaluated
with the new degrees of freedom. It tends to be an overly conservative estimate for relatively
small sample sizes.
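
The adjustment of the degrees of freedom can be sketched as below. This is an illustration under stated assumptions rather than the SPSS computation used in the study: the function names are hypothetical, epsilon is estimated with the standard Box/Greenhouse-Geisser formula from a covariance matrix of the repeated measures, the example covariance matrices are invented, and the denominator degrees of freedom (k − 1)(n − 1) apply to a single-group within-subject design without between-subject predictors.

```python
def greenhouse_geisser_epsilon(cov):
    """Greenhouse-Geisser epsilon from a k x k covariance matrix.

    cov: covariance matrix of the k repeated measurements (list of lists).
    Epsilon equals 1 under sphericity and has a lower bound of 1 / (k - 1).
    """
    k = len(cov)
    grand_mean = sum(sum(row) for row in cov) / (k * k)
    diag_mean = sum(cov[i][i] for i in range(k)) / k
    row_means = [sum(row) / k for row in cov]
    sum_squares = sum(v * v for row in cov for v in row)

    numerator = (k * (diag_mean - grand_mean)) ** 2
    denominator = (k - 1) * (
        sum_squares
        - 2 * k * sum(m * m for m in row_means)
        + (k * grand_mean) ** 2
    )
    return numerator / denominator


def adjusted_df(epsilon, n_sessions, n_subjects):
    """Multiply numerator and denominator degrees of freedom by epsilon."""
    df_num = (n_sessions - 1) * epsilon
    df_den = (n_sessions - 1) * (n_subjects - 1) * epsilon
    return df_num, df_den


# Invented covariance matrices for three sessions (NOT estimated from the study).
spherical = [[2, 1, 1], [1, 2, 1], [1, 1, 2]]        # compound symmetry
non_spherical = [[1, 0, 0], [0, 1, 0], [0, 0, 4]]    # unequal variances

eps_spherical = greenhouse_geisser_epsilon(spherical)
eps_non_spherical = greenhouse_geisser_epsilon(non_spherical)
df_num, df_den = adjusted_df(eps_non_spherical, n_sessions=3, n_subjects=60)
print(eps_spherical, eps_non_spherical, df_num, df_den)
```

The F ratio itself is unchanged by the correction; only the reference distribution used to obtain the P-value is evaluated with the reduced degrees of freedom.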

The study was approved by the Human Subjects Approval Board at Stanford University, Palo Alto, CA.
It was also performed according to national legislation and institutional guidelines in Norway. All
participants signed an informed consent form.

3. Results 

3.1. Hormone analysis 

In Table 1 the relevant endocrine data of GID patients and controls are summarized. No significant
differences were found between the GID patients and their sex-matched controls before treatment,
except for the male GID cortisol levels versus C, probably because of high prolactin as an
expression of stress. Furthermore, the values were normal according to laboratory standards.

Table 1
Biochemical characteristics of the study subjects

| | GID females T1* | GID females T2* | GID females T3* | GID males T1* | GID males T2* | GID males T3* | Control females | Control males |
|---|---|---|---|---|---|---|---|---|
| DHT† (nM) | 0.6 (0.4)** | 1.2 (0.8) | 1.8 (0.7) | 1.3 (0.6) | 0.8 (1.0) | 0.6 (0.7) | 0.4 (0.2) | 1.3 (0.3) |
| E1† (nM) | 0.4 (0.2) | 0.4 (0.2) | 0.3 (0.1) | 0.3 (0.2) | 0.6 (0.7) | 0.6 (0.7) | 0.3 (0.1) | 0.2 (0.03) |
| E2† (nM) | 0.4 (0.2) | 0.2 (0.2) | 0.2 (0.1) | 0.1 (0.08) | 0.2 (0.2) | 0.2 (0.2) | 0.3 (0.3) | 0.1 (0.02) |
| FSH† (IU/l) | 8.0 (12.3) | 7.1 (12.7) | 4.7 (4.1) | 4.7 (3.4) | 2.2 (3.1) | 1.7 (1.6) | 7.5 (6.6) | 3.7 (1.4) |
| LH† (IU/l) | 14.8 (15.5) | 7.8 (11.1) | 6.0 (6.3) | 6.0 (3.3) | 2.2 (1.8) | 2.4 (2.7) | 15.0 (23.7) | 4.6 (1.6) |
| Progesterone† (nM) | 10.8 (15.5) | 2.3 (5.1) | 1.4 (1.4) | 1.5 (0.6) | 2.0 (3.0) | 2.4 (3.6) | 10.3 (20.7) | 0.8 (0.4) |
| SHBG† (nM) | 48.1 (21.6) | 27.6 (15.6) | 26.3 (16.5) | 32.4 (17.3) | 151.9 (96.8) | 189.8 (100.2) | 75.6 (36.1) | 23.8 (7.1) |
| Testosterone† (nM) | 3.2 (7.6) | 23.3 (11.8) | 29.2 (12.0) | 16.8 (9.7) | 9.0 (14.9) | 6.8 (9.0) | 1.2 (0.6) | 17.4 (4.0) |
| FTI† (T/SHBG × 10) | 0.7 (0.4) | 8.4 (4.4) | 11.1 (3.8) | 5.2 (2.0) | 0.6 (0.4) | 0.4 (0.2) | 0.2 (0.2) | 7.3 (1.1) |
| Prolactin† (nM) | 219.0 (84.1) | 220.5 (133.6) | 194.2 (80.8) | 174.3 (87.8) | 213.2 (96.5) | 212.3 (87.5) | 251.9 (70.9) | 142.8 (39.6) |
| Cortisol† (nM) | 348.4 (161.0) | 288.0 (172.2) | 301.2 (114.5) | 473.9 (145.1) | 648.0 (286.3) | 629.3 (249.7) | 316.6 (78.4) | 196.3 (61.8) |

T1* = baseline testing, T2* = after 3 months, T3* = after 12 months.
** Mean (± S.D.).
† Laboratory standard values; see Section 2.4 (Laboratory methods).

In male GID patients, ethinylestradiol treatment led to a significant decrease in testosterone
levels and a 6-fold increase in SHBG levels, causing a reduction of FTI from normal male (5.2) to
normal female values (0.4). Similarly, there was a 50% decrease in DHT. Only marginal increases
were found in serum levels of E1 and E2, because the assays are specific for these two estrogens
and do not detect ethinylestradiol. However, the biological effect of the estrogen treatment was
clearly demonstrated by the highly significant increase in serum SHBG levels in the male GID
patients. In the female GID patients, testosterone treatment led to a significant increase in
testosterone concentrations from normal female to normal male levels, and simultaneously to a
pronounced reduction in SHBG levels. FTI increased 16-fold from normal female (0.6) to normal male
values (11.1), and DHT showed a 3-fold increase.
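
The fold changes quoted above can be checked against the group means in Table 1 with a short sketch. The helper names `fold_change` and `percent_decrease` are hypothetical, and the calculation uses the rounded group means (T1 versus T3) from the table, so it reproduces the reported figures only approximately.

```python
def fold_change(before: float, after: float) -> float:
    """Ratio of the value after treatment to the value before treatment."""
    if before == 0:
        raise ValueError("baseline value must be non-zero")
    return after / before


def percent_decrease(before: float, after: float) -> float:
    """Relative decrease from baseline, expressed in percent."""
    if before == 0:
        raise ValueError("baseline value must be non-zero")
    return (before - after) / before * 100


# Group means from Table 1 (T1 = baseline, T3 = after 12 months).
shbg_male_gid = fold_change(32.4, 189.8)      # SHBG in male GID patients
dht_male_gid = percent_decrease(1.3, 0.6)     # DHT in male GID patients
fti_female_gid = fold_change(0.7, 11.1)       # FTI in female GID patients
dht_female_gid = fold_change(0.6, 1.8)        # DHT in female GID patients

print(round(shbg_male_gid, 1), round(dht_male_gid), round(fti_female_gid, 1),
      round(dht_female_gid, 1))
```

With the tabulated means, the SHBG and DHT ratios round to the 6-fold, 50% and 3-fold changes stated in the text; the FTI ratio is close to 16 when the tabulated baseline mean of 0.7 is used.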

The neuropsychological tests of the control and study female participants were planned to take
place during the first 2 weeks of their cycle. The hormone measurements, however, showed that 8 of
the 45 participating females were tested while they were in the luteal phase of their cycles
(progesterone > 15 nmol/l). Nevertheless, the typical cognitive sex differences of GID patients
were comparable to those of their sex-matched control group at baseline, and are published
elsewhere (Haraldsen et al., 2003). Male GID patients showed typical significant cognitive sex
differences favoring them in rotation and visualization, and a tendency to perform more poorly in
perception, compared with female GID patients. A similar pattern of cognitive sex differences was
shown for the sex-matched control groups in this study.

3.2. Repeated measures 

To test the hypothesis that cognitive performance would change over time, we first analyzed the
neuropsychological data with regard to their temporal variation (Table 2). By means of repeated
measures ANOVA, we found a highly significant improvement of performance over time for all
cognitive factors (P-values for T1 vs. T3 and for T2 vs. T3, respectively: perception 0.0005 and
0.05, arithmetic 0.0005 and 0.0005, visualization 0.01 and 0.0005, logic 0.0005 and 0.3, rotation
0.0005 and 0.03, verbalization 0.0005 and 0.2).

Second, to test the influence of each predictor (born sex, group, education, and age), we performed
an unadjusted repeated measures ANOVA for each of them (Table 3). The main effect of the predictor
age showed that younger subjects achieved significantly higher scores than older subjects (Table 3,
shaded background). None of the other unadjusted predictors influenced the main effect
significantly (Table 3), but several significant interactions between time and predictor were found
at this step of the analysis (sex × time, education × time, age × time, group × time; Table 3).
There was no significant three-way interaction of born sex × group × time for the six investigated
cognitive abilities (Table 3).

Table 2
Raw scores of the tested cognitive abilities by group, time and sex

| Cognitive factor | Sex | C T1* (n = 29) | C T2* (n = 29) | C T3* (n = 29) | GID-N T1* (n = 33) | GID-N T2* (n = 33) | GID-N T3* (n = 33) | GID-US T1* (n = 19) | GID-US T2* (n = 19) | GID-US T3* (n = 19) |
|---|---|---|---|---|---|---|---|---|---|---|
| Perception | Female (n = 45) | 41.2 (6.2)** | 44.3 (3.3) | 44.5 (2.2) | 36.1 (8.1) | 38.7 (6.2) | 40.8 (7.1) | 31.1 (8.7) | 32.5 (8.1) | 30.4 (7.2) |
| | Male (n = 36) | 39.3 (4.4) | 41.5 (5.2) | 43.8 (3.7) | 31.0 (6.0) | 30.2 (6.6) | 30.1 (4.7) | 34.1 (7.9) | 33.9 (7.4) | 37.9 (7.9) |
| Arithmetic | Female | 5.1 (1.2) | 5.2 (1.7) | 5.6 (1.2) | 4.1 (1.8) | 4.4 (2.0) | 5.0 (1.8) | 6.9 (2.8) | 7.0 (3.4) | 7.6 (3.8) |
| | Male | 5.3 (4.4) | 6.8 (1.5) | 7.4 (1.2) | 4.2 (1.2) | 4.0 (1.2) | 4.4 (1.2) | 7.9 (2.50) | 7.6 (2.1) | 8.2 (2.9) |
| Visualization | Female | 10.0 (3.4) | 8.8 (2.1) | 9.1 (2.5) | 7.3 (4.5) | 6.8 (3.3) | 8.0 (3.7) | 8.4 (3.9) | 6.9 (2.2) | 6.3 (2.6) |
| | Male | 11.4 (1.4) | 9.8 (1.2) | 11.4 (1.7) | 8.8 (2.8) | 6.9 (2.7) | 8.4 (3.9) | 12.4 (4.30) | 9.8 (2.8) | 11.0 (2.9) |
| Logic | Female | 6.5 (1.5) | 6.9 (1.3) | 7.0 (1.4) | 5.8 (1.6) | 6.5 (1.4) | 6.5 (2.1) | 7.2 (2.6) | 8.0 (2.0) | 7.7 (2.2) |
| | Male | 6.8 (3.4) | 7.6 (2.0) | 8.8 (2.3) | 5.7 (1.5) | 6.6 (2.0) | 6.7 (1.3) | 8.9 (2.3) | 9.1 (2.2) | 9.7 (2.5) |
| Rotation | Female | 3.8 (1.9) | 7.8 (2.1) | 9.2 (2.2) | 3.6 (2.4) | 8.7 (3.0) | 8.5 (3.3) | 3.7 (2.3) | 6.8 (2.4) | 6.8 (1.9) |
| | Male | 5.2 (1.8) | 9.9 (1.2) | 10.9 (1.8) | 4.3 (1.7) | 6.1 (3.1) | 6.2 (3.00) | 5.2 (1.3) | 10.4 (2.5) | 10.6 (2.8) |
| Verbalization | Female | 15.2 (3.4) | 16.2 (2.0) | 17.6 (3.3) | 12.6 (5.7) | 13.8 (5.4) | 14.0 (6.5) | 12.5 (4.7) | 15.0 (1.3) | 12.9 (5.0) |
| | Male | 15.2 (1.2) | 19.2 (3.3) | 21.0 (4.5) | 1.9 (3.1) | 13.9 (3.6) | 13.1 (4.5) | 14.1 (4.4) | 14.4 (4.9) | 15.1 (5.3) |

T1* = baseline testing, T2* = after 3 months, T3* = after 12 months.
** Mean (± S.D.).

In the third and final step of the repeated measures ANOVA, each predictor’s main effect,
controlled for the others, and its interaction with time on the cognitive abilities were tested, as
well as the controlled three-way interaction of born sex × group × time (Table 4). Only the main
effect of age significantly influenced the test results. Higher age implied lower raw scores. In
adjusted repeated testing, born sex, education and group showed no significant main influence. In
other words, in repeated neuropsychological testing, age determines how well the participants score
on retesting. In contrast, the significant improvements (predictor × time interaction, Table 4) of
the cognitive scores were verified in the final analysis for five of the six factors (perception,
arithmetic, logic, rotation and verbalization). Only age and education were significantly
associated with these improvements; group and sex additionally influenced the improvements in the
perception tests. Again, younger participants were better able to improve their scoring results
over time than older participants. Participants with a higher education level showed significantly
better learning than those with a lower education level. Only in the perception tests did GID-US and
GID-N participants show a larger improvement than the C group, and females a larger improvement than
males, between test sessions 1 and 2. Nevertheless, these group differences were unidirectional,
with positive improvements of the slopes in all three group categories (GID-US, GID-N, C). That
means there were qualitative differences in the improvements between groups and sexes. Overall, born
sex and group (for 5 of 6 cognitive factors) were far from having any influence on the slopes. All
interaction effects on all cognitive factors between time and these two predictors, which were
significant when not adjusted for each other (Table 3), were completely eliminated after adjustment
(except group × time and group × sex for perception; Table 4).









Table 3
Unadjusted repeated measures ANOVA of all participants (n = 81)

| Predictor | Statistic | Perception | Arithmetic | Visualization | Logic | Rotation | Verbalization |
|---|---|---|---|---|---|---|---|
| Born sex | Mauchly's Test of Sphericity | 0.95 | 0.95 | 0.72 | 0.98 | 0.79 | 0.94 |
| | F; df; P (main effect) | 0.66; 1.91; 0.51 | 0.37; 1.90; 0.68 | 1.70; 1.44; 0.20 | 2.93; 1.95; 0.06 | 0.19; 1.65; 0.80 | 0.47; 1.83; 0.61 |
| | F; df; P (predictor × time) | 4.49; 1.00; 0.04ᵃ | 2.10; 1.00; 0.15 | 3.03; 1.00; 0.09 | 3.17; 1.00; 0.08 | 1.33; 1.00; 0.26 | 0.15; 1.00; 0.70 |
| Educationᵇ | Mauchly's Test of Sphericity | 0.96 | 0.94 | 0.76 | 0.97 | 0.82 | 0.90 |
| | F; df; P (main effect) | 0.53; 3.84; 0.71 | 0.97; 3.77; 0.42 | 2.61; 3.00; 0.06 | 0.45; 3.89; 0.80 | 1.40; 3.27; 0.25 | 1.62; 3.60; 0.18 |
| | F; df; P (predictor × time) | 0.98; 2.00; 0.40 | 18.4; 2.00; <0.0005ᵃ | 4.92; 2.00; 0.01ᵃ | 16.5; 2.00; <0.0005ᵃ | 2.88; 2.00; 0.07 | 12.1; 2.00; <0.0005ᵃ |
| Ageᶜ | Mauchly's Test of Sphericity | 0.96 | 0.96 | 0.77 | 1.0 | 0.83 | 0.86 |
| | F; df; P (main effect) | 3.53; 1.92; 0.03ᵃ | 3.35; 1.92; 0.04ᵃ | 3.51; 1.53; 0.05ᵃ | 0.45; 1.9; 0.63 | 2.52; 1.66; 0.10 | 4.63; 1.71; 0.02ᵃ |
| | F; df; P (predictor × time) | 8.4; 1.00; 0.005ᵃ | 15.2; 1.00; 0.0005ᵃ | 0.26; 1.00; 0.61 | 4.90; 1.00; 0.03ᵃ | 3.63; 1.00; 0.06 | 0.08; 1.00; 0.77 |
| Groupᵇ | Mauchly's Test of Sphericity | 0.96 | 0.96 | 0.75 | 0.97 | 0.82 | 0.91 |
| | F; df; P (main effect) | 0.62; 3.85; 0.70 | 0.78; 3.82; 0.53 | 1.29; 3.00; 0.3 | 0.40; 3.90; 0.80 | 1.74; 3.27; 0.16 | 1.96; 3.62; 0.11 |
| | F; df; P (predictor × time) | 10.8; 2.00; <0.000ᵃ | 14.7; 2.00; <0.0005ᵃ | 3.72; 2.00; 0.03ᵃ | 8.42; 2.00; 0.01ᵃ | 1.25; 2.00; 0.30 | 5.67; 2.00; 0.006ᵃ |
| Born sex × group × time interaction | F; df; P | 2.24; 3.72; 0.07 | 1.58; 3.89; 0.19 | 1.90; 2.90; 0.14 | 0.57; 3.90; 0.68 | 0.90; 1.61; 0.39 | 0.55; 3.60; 0.67 |

Unadjusted P-values for each predictor’s influence are given for the six cognitive abilities in the ‘Main effect’ lines.
The ‘predictor × time’ lines apply to interactions. For instance, the P-value for born sex is 0.51 for perception, while the interaction of born sex × time is significant (P-value 0.04).
The analysis is based on repeated measures ANOVA with Greenhouse-Geisser correction.
Table 4

Adjusted repeated measures ANOVA of all participants (n = 81)

Ability       | Row (F; df; P)               | MS** | Born sex              | Education             | Age**                 | Group+                   | Group x sex x time interaction
Perception    | Main effect                  | 0.95 | 0.44; 1.90; 0.63      | 0.77; 3.81; 0.54      | 2.87; 1.97; 0.06      | 0.85; 3.81; 0.94         |
Perception    | predictor x time controlled  |      | 9.54; 1.00; 0.003***  | 7.60; 2.00; 0.001***  | 6.50; 1.00; 0.014***  | 12.70; 2.00; <0.005***   | 1.29; 7.39; 0.26
Arithmetic    | Main effect                  | 0.92 | 0.45; 1.92; 0.61      | 2.15; 3.85; 0.08      | 2.95; 1.92; 0.06      | 1.13; 3.85; 0.35         |
Arithmetic    | predictor x time controlled  |      | 0.37; 1.00; 0.55      | 3.87; 2.00; 0.03      | 9.10; 1.00; 0.004***  | 0.85; 2.00; 0.43         | 1.25; 7.80; 0.28
Visualization | Main effect                  | 0.72 | 2.07; 1.44; 0.15      | 1.57; 2.88; 0.21      | 2.04; 1.40; 0.15      | 0.61; 2.88; 0.61         |
Visualization | predictor x time controlled  |      | 1.65; 1.00; 0.21      | 2.68; 2.00; 0.08      | 0.75; 1.00; 0.39      | 1.50; 2.00; 0.23         | 0.91; 2.92; 0.44
Logic         | Main effect                  | 0.97 | 2.59; 1.94; 0.08      | 0.10; 3.88; 1.00      | 0.86; 1.94; 0.42      | 0.09; 3.80; 0.98         |
Logic         | predictor x time controlled  |      | 0.88; 1.00; 0.35      | 5.62; 2.00; 0.006***  | 1.12; 1.00; 0.28      | 0.43; 2.00; 0.65         | 0.59; 3.9; 0.66
Rotation      | Main effect                  | 0.84 | 0.62; 1.67; 0.51      | 0.85; 3.34; 0.48      | 2.78; 1.67; 0.08      | 0.60; 3.34; 0.67         |
Rotation      | predictor x time controlled  |      | 0.41; 1.00; 0.53      | 3.25; 2.00; 0.05***   | 6.0; 1.00; 0.02***    | 0.58; 2.00; 0.57         | 1.15; 3.29; 0.34
Verbalization | Main effect                  | 0.85 | 0.59; 1.7; 0.53       | 0.27; 3.40; 0.90      | 3.7; 1.70; 0.03***    | 0.63; 3.40; 0.60         |
Verbalization | predictor x time controlled  |      | 0.05; 1.00; 0.82      | 6.5; 2.00; 0.003***   | 1.04; 1.00; 0.31      | 1.63; 2.00; 0.21         | 0.27; 3.40; 0.87


**Mauchly’s Test of Sphericity. 

***Statistically significant differences. 

*Sex, Education (high school, college and university degree), and Group (C, GID-N and GID-US) used as contrasted factors. 

* ’ Age continually distributed, used as covariate; each predictor is controlled for the others. 

Adjusted P-values for each predictor’s influence are given for the six cognitive abilities in the ‘Main effect’ lines, controlled for the others.
The ‘Predictor x time’ lines apply to interactions. For instance, the P-value for born sex is 0.63 for perception, while the interaction of born
sex x time is significant (P-value 0.003).

The analysis is based on repeated measures ANOVA with Greenhouse-Geisser correction.

Model used: born sex, group and education served as fixed factors and age as covariate, all controlled for each other. Each cognitive factor
served in turn as the dependent variable.
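
The Greenhouse-Geisser correction mentioned in the table notes scales the degrees of freedom of a repeated measures F test by a factor epsilon, which is why the tabulated df values are non-integer. The following is a minimal sketch of the standard epsilon formula, not the authors' software output; the function names and the two covariance matrices are hypothetical and serve only as illustration.

```python
def greenhouse_geisser_epsilon(cov):
    """Greenhouse-Geisser epsilon for a k x k covariance matrix of repeated measures.

    epsilon = trace(C)^2 / ((k - 1) * sum of squared entries of C),
    where C is the double-centered covariance matrix.
    """
    k = len(cov)
    if k < 2 or any(len(row) != k for row in cov):
        raise ValueError("cov must be a square matrix with k >= 2")
    row_means = [sum(row) / k for row in cov]
    col_means = [sum(cov[i][j] for i in range(k)) / k for j in range(k)]
    grand_mean = sum(row_means) / k
    # Double-centering removes row and column means from every entry.
    centered = [
        [cov[i][j] - row_means[i] - col_means[j] + grand_mean for j in range(k)]
        for i in range(k)
    ]
    trace = sum(centered[i][i] for i in range(k))
    sum_sq = sum(v * v for row in centered for v in row)
    return trace ** 2 / ((k - 1) * sum_sq)


def corrected_df(epsilon, k):
    """Corrected numerator df for a within-subject factor with k levels."""
    return epsilon * (k - 1)


# Hypothetical covariance matrices for three test occasions (illustration only).
spherical = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
unequal = [[1.0, 0.9, 0.2], [0.9, 1.0, 0.3], [0.2, 0.3, 1.0]]

eps_spherical = greenhouse_geisser_epsilon(spherical)  # sphericity holds, epsilon is 1
eps_unequal = greenhouse_geisser_epsilon(unequal)      # sphericity violated, epsilon below 1
print(eps_spherical, eps_unequal, corrected_df(0.955, 3))
```

With three test occasions the uncorrected df for time is 2, so a tabulated df of about 1.9 corresponds to an epsilon of roughly 0.95 under this reading.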


3.3. Hormonal and cognitive changes over time 

In the fourth step of data analysis, we investigated
the relationship between hormone serum levels and
cognition. There were no significant Pearson correlations
between any of the endocrinological parameters
(e.g., Delta LH, FSH) and the change of any of the
neuropsychological test results (for example,
Pearson correlation for rotation = 0.11, P = 0.54).
Plasma hormone levels may not be normally distributed
(although in our study the assumption of normality
was not rejected for any of the delta values).
Therefore, we also performed Spearman correlations,
but the results did not differ.
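
The comparison of Pearson and Spearman correlations on change scores can be sketched as follows. This is an illustrative example only: the data values and variable names are hypothetical, not study data, and the Spearman coefficient is computed here as the Pearson correlation of average ranks.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            result[order[pos]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical change scores (follow-up minus baseline) for five participants.
delta_testosterone = [1.0, 2.0, 3.0, 4.0, 5.0]
delta_rotation = [2.0, 1.0, 4.0, 3.0, 5.0]

r_pearson = pearson(delta_testosterone, delta_rotation)
r_spearman = spearman(delta_testosterone, delta_rotation)
print(r_pearson, r_spearman)
```

Because the rank-based coefficient does not rely on normally distributed values, agreement between the two coefficients suggests that a conclusion does not hinge on the distributional assumption.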

The “male” and “female” cognitive factors showed
a development similar to the sex-insensitive “control”
cognitive factors, despite different gender identifications
or sex. Instead, only age significantly influenced
the main performance of the repeatedly
tested cognitive function. Furthermore, age and education
significantly predicted the learning ability of
the participant.

4. Discussion 

In our recently published analysis of untreated
healthy early onset GID patients, we showed that
born sex predicts the cognitive pattern of GID patients
and that of their born sex control group in a similar
fashion (Haraldsen et al., 2003). Despite their different
gender identity, all male participants scored significantly
higher in rotation and visualization than
females. Although a significant female advantage in
verbal fluency and perception was not verified, a
tendency could be documented for the latter.

In the present study we show that cross-sex hormone
treatment also fails to change cognitive function
in early onset GID patients. The cognitive ability
remained similar to that of their born sex control
group, a finding that is in strong contrast to the
somatic and endocrine changes observed with such
treatment. Instead, we demonstrated a positive, unidirectional
learning effect in all participants that we
interpret as a test/retest effect. During 1 year of cross-sex
hormone treatment, all group categories (GID-N,
GID-US, C) and both sexes showed a parallel improvement
of their test scores (Table 4). There was no
significant three-way interaction of born sex x group
x time for the six investigated cognitive abilities
(Tables 3 and 4). That implies, for instance, that a
potential interaction between group and time does not
depend significantly on sex. Our data demonstrate that
the slope of the learning curve depended significantly
on the interaction between time and age, as well as
between time and education level. All other predictors
had a parallel impact on the overall unidirectional,
positive test score improvements (Table 4). Younger
and more highly educated participants “learned better” than
older participants. Our findings therefore fail to support
two of the few earlier publications in this field
(Van Goozen et al., 1994; Slabbekoorn et al., 1999),
which reported an increase of the mental rotation
factor scores over time, explained by cross-sex hormone
treatment. In addition, one of these studies (Van
Goozen et al., 1994) showed deteriorated verbal fluency
scores in female GID patients treated with androgens.
We were unable to replicate any of these results.
Furthermore, our results contradict findings of a negative
“short-term” influence on spatial abilities over a
3-month period in estrogen-treated male GID patients
(Slabbekoorn et al., 1999). On the other hand, we
partly confirm two earlier studies, which showed no
effect of estrogen treatment on spatial ability in male
GID patients (Miles et al., 1998; Van Goozen et al.,
2002) and no effect of testosterone treatment in female
GID patients (Van Goozen et al., 2002). Miles et al. (1998)
also assessed estrogen sensitivity by showing an effect
on verbal memory in male GID patients treated with
estrogens. We used a different verbalization test to
assess estrogen sensitivity (verbal fluency), which did
not permit direct comparison.

The different results might be partly explained by
the chosen statistical approach. To specify the effects
of various possible confounders (age, born sex, education,
endocrine changes) on repeated neuropsychological
testing in GID patients, we took advantage of
recruiting young subjects from a homogeneous public
health care system (GID-N) and older subjects from a
private, insurance company financed health care system,
the latter also having a higher mean socioeconomic
status (GID-US). Even if a confounder does
not show a significant correlation with the test result,
it might still be identified as significant in later analysis in
combination with other confounders (Altman, 1991).
We therefore included all variables of interest in the
final analysis. Interestingly, although we observed
group x time interactions in the second step of analysis
(confirming the hypothesis of a different slope
between the groups based on endocrine changes,
Table 3), no born sex x group x time effects were
found (Table 3). Moreover, these significant differences
disappeared in the final adjusted analysis (Table
4). In that analysis, group differences were only seen in the
variable perception, and no interaction was found
between group x time or group x sex x time for any
of the cognitive factors. This means that the same kind
of movement with time (a unidirectional improvement
of the slope) was found in all groups and both sexes.
Further, it could be argued that the two different GID
groups do not belong to the same diagnostic group.
However, all included GID patients fulfilled early
onset GID criteria and were almost all homosexually
orientated.

Furthermore, in the earlier studies, the endocrine
changes were not monitored by serum samples. This
is an advantage of the present study. Here, we were
unable to find any correlations between the significant
endocrine changes and the improved test results (e.g.,
Delta Rotation versus Delta Testosterone). Although
the test/retest effect observed in our study was significant,
it was relatively small and unlikely to conceal
any substantial effect of the treatment. Given a commonly
used significance level of 5% and a power of
80% (using rotation as an example), we calculated
that 30 individuals in each sex group are sufficient to
detect a standardized clinical difference of 0.74 [the
figure actually observed at baseline, Table 2 (Altman,
1991)]. The sample size in our study exceeds this
requirement. Moreover, repeated measurement also
adds to the power.
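
The order of magnitude of this sample size can be checked with the usual normal-approximation formula for comparing two group means, n per group = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2. This is a sketch under that assumption and not necessarily the exact procedure used for the published calculation; the function name `n_per_group` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per group for a two-sided comparison
    of two means, given a standardized difference (effect_size)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Standardized difference of 0.74, 5% significance level, 80% power.
required = n_per_group(0.74)
print(required)  # 29
```

The approximation gives 29 per group, consistent with the statement that 30 individuals in each sex group are sufficient.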

Although our present study had the power to register
sex and group differences, as well as a potential
crossover phenomenon, it could be argued that our
female cognitive factors did not show significant differences
at baseline (except a tendency for perception)
and therefore might be insensitive to “cross-sex
changes”. Nevertheless, such a cross-sex phenomenon
has been reported without documenting significant
sex differences at baseline (Van Goozen et al., 1995;
Slabbekoorn et al., 1999).

Interactions between sex and group were not seen in
our baseline study (Haraldsen et al., 2003). In addition,
in the present study no temporal interactions could be
found between sex, group and time for the investigated
cognitive factors using the Repeated Measures
General Linear Model. However, the significant influence
of born sex at baseline on some cognitive factors
might have been missed in repeated measurements
because of the strong and similar influence of age in
both sexes on the learning effect, as well as the strong
influence of age as a main effect on repeatedly measured
cognitive functions. Nevertheless, we documented
only a unidirectional tendency of the differences
between the sexes (Table 2) and documented no crossover
phenomenon, which would have given significant
results in the adjusted repeated measures ANOVA for
group x time or group x sex x time (Table 4). This
result is supported by the absence of a correlation
between hormonal and cognitive change (see Section 3.3).

Our findings are in line with the negative studies of
patients receiving hormone replacement therapy in
deficiency states or of patients with prenatal endocrinological
diseases (Wisniewski, 1998; LeBlanc et al.,
2001). Despite the methodological problems in
repeated measures studies of cognitive function, prenatally
or postnatally imprinted target cells are re-influenced
by their cognate ligand (McEwen, 1997;
McEwen et al., 1997). In cross-sex hormone treatment,
however, the sex hormones act on target cells that have
not been previously exposed to such high levels of
these hormones. Furthermore, cross-sex testosterone
treatment probably increases aromatization of testosterone
to estrogen, and this may cause even higher
estrogen levels in the brains of biological females
(Krey et al., 1982; Westley and Salaman, 1977). Moreover,
sexually differentiated regions and target cells
may interact in a complex fashion with sex steroids to
maintain sexually dimorphic peptide expression, in
which case cross-sex hormone treatment would not
necessarily result in a unidirectional “cross-sex cognitive
performance”.


In conclusion, this study provides strong evidence
that the cognitive performance of healthy
GID patients is resistant to cross-sex hormone treatment,
and is therefore in strong contrast to the
substantial peripheral effects of this treatment.
Taken together, our previous (Haraldsen et al.,
2003) and the present study show that early onset
GID patients possess not only the cognitive performance
pattern of their born sex before treatment,
but their cognitive pattern is also resistant to the
impact of cross-sex hormones. In other words, our
study failed to support previous studies and provided
no evidence for the role of cross-sex hormone
treatment in the malleability of adult cognitive brain
function. In addition, the documented test/retest
effect is of tremendous importance for cognitive
research for all patient groups and should be taken
into consideration when evaluating treatment effects
generally in psychiatry.